Programming Heterogeneous Parallel Machines Using Refactoring and Monte-Carlo Tree Search
Funding: This work was supported by the EU Horizon 2020 project, TeamPlay, Grant Number 779882, and UK EPSRC Discovery, Grant Number EP/P020631/1. This paper presents a new technique for introducing and tuning parallelism for heterogeneous shared-memory systems (comprising a mixture of CPUs and GPUs), using a combination of algorithmic skeletons (such as farms and pipelines), Monte–Carlo tree search (MCTS) for deriving mappings of tasks to available hardware resources, and refactoring tool support for applying the patterns and mappings in an easy and effective way. Using our approach, we demonstrate easily obtainable, significant and scalable speedups on a number of case studies, with speedups of up to 41 over the sequential code on a 24-core machine with one GPU. We also demonstrate that the speedups obtained by mappings derived by the MCTS algorithm are within 5–15% of the best-obtained manual parallelisation.
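As a rough illustration of the two skeletons named in the abstract above, the following Python sketch shows a farm (independent workers applied to each input) and a pipeline (stages applied in sequence). It is not the paper's tooling: the names `farm`, `pipeline`, `square` and `increment` are hypothetical, and the thread pool only shows the structure of the pattern, since standard CPython threads do not speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def farm(worker, inputs, n_workers=4):
    """Task farm: apply `worker` to every input independently.

    Results come back in input order, because Executor.map preserves it.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(worker, inputs))


def pipeline(*stages):
    """Pipeline: pass each item through the stages in order.

    This is a sequential reference version; a parallel pipeline would
    run each stage concurrently on different items.
    """
    def run(inputs):
        for item in inputs:
            for stage in stages:
                item = stage(item)
            yield item
    return run


# Hypothetical worker functions used only for this illustration.
def square(x):
    return x * x


def increment(x):
    return x + 1


farm_result = farm(square, range(5))
pipe_result = list(pipeline(square, increment)(range(5)))
```

Choosing how many workers to use, and which stages to place on which device, is the mapping problem that the abstract assigns to Monte–Carlo tree search; this sketch fixes those choices by hand.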
The HERMIT in the Tree
This paper describes our experience using the HERMIT toolkit to apply well-known transformations to the internal core language of the Glasgow Haskell Compiler. HERMIT provides several mechanisms to support writing general-purpose transformations: a domain-specific language for strategic programming specialized to GHC's core language, a library of primitive rewrites, and a shell-style scripting language for interactive and batch usage. There are many program transformation techniques that have been described in the literature but have not been mechanized and made available inside GHC, either because they are too specialized to include in a general-purpose compiler, or because the developers' interest is in theory rather than implementation. The mechanization process can often reveal pragmatic obstacles that are glossed over in pen-and-paper proofs; understanding and removing these obstacles is our concern. Using HERMIT, we implement eleven examples of three program transformations, report on our experience, and describe improvements made in the process.
Refactoring GrPPI: Generic Refactoring for Generic Parallelism in C++
Funding: EU Horizon 2020 project, TeamPlay (https://www.teamplay-xh2020.eu), Grant Number 779882, UK EPSRC Discovery, Grant Number EP/P020631/1, and Madrid Regional Government, CABAHLA-CM (ConvergenciA Big dAta-Hpc: de Los sensores a las Aplicaciones), Grant Number S2018/TCS-4423. The Generic Reusable Parallel Pattern Interface (GrPPI) is a very useful abstraction over different parallel pattern libraries, allowing the programmer to write generic patterned parallel code that can easily be compiled to different backends such as FastFlow, OpenMP, Intel TBB and C++ threads. However, rewriting legacy code to use GrPPI still involves code transformations that can be highly non-trivial, especially for programmers who are not experts in parallelism. This paper describes software refactorings to semi-automatically introduce instances of GrPPI patterns into sequential C++ code, as well as safety-checking static analysis mechanisms that verify that introducing patterns into the code does not introduce concurrency-related bugs such as race conditions. We demonstrate the refactorings and safety-checking mechanisms on four simple benchmark applications, showing that we are able to obtain, with little effort, GrPPI-based parallel versions that achieve good speedups (comparable to those of manually produced parallel versions) using different pattern backends.
Fold-Unfold Transformations On State Monadic Interpreters
In this paper we advocate the use of fold-unfold transformations for mastering the complexity of abstract machines intended for real implementations. The idea is to express the abstract machine as an interpreter in a purely functional language. The initial interpreter should be 'obviously correct' (but might be inefficient; we don't care at this point). Fold-unfold transformations are then used to remove inefficiencies in the interpreter/abstract machine. We illustrate this by deriving (the equivalent of) the E-scheme of the G-machine from (the equivalent of) the composition of C and the EVAL instruction. This is first done on a call-by-name (tree reduction) interpreter. To model sharing and the graph manipulation that goes on in a real graph reduction implementation, we use state monads. We then perform the same transformation on the state monadic interpreter. Transforming the state monadic interpreter is much less straightforward, as we have to lean heavily on the laws of the state monad.
The role of sleep and the simple passage of time in the consolidation of motor skill learning
Thesis digitized by the Direction des bibliothèques de l'Université de Montréal